Question classification is one of the tasks in a question answering system. Because questions often contain rare words and colloquial expressions, especially in voice interaction applications, traditional text classification methods perform poorly on short questions. A short question classification algorithm based on semantic extension was therefore proposed. It used a search engine to extend the knowledge available for a short question, then obtained the question's category by selecting features with a topic model and calculating word similarity. The experimental results show that the proposed method achieves an F-measure of 0.713 on a set of 1365 real questions, which is higher than that of the Support Vector Machine (SVM), K-Nearest Neighbor (KNN) and maximum entropy algorithms. The method can therefore improve the accuracy of question classification in a question answering system.
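The idea of extending a short question and then classifying it by similarity can be illustrated with a minimal sketch. It is not the algorithm from the abstract: the search engine is replaced by a list of snippet strings passed in by the caller, the topic model feature selection is omitted, and word similarity is approximated by cosine similarity over word counts. The names `classify`, `snippets` and `category_keywords` are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive whitespace tokenizer; a real system would need proper word segmentation.
    return text.lower().split()


def cosine(a, b):
    # Cosine similarity between two bag-of-words Counters.
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def classify(question, snippets, category_keywords):
    # Assumption: `snippets` stands in for text returned by a search engine.
    expanded = Counter(tokenize(question))
    for snippet in snippets:
        expanded.update(tokenize(snippet))
    # Assumption: each category is described by a hand-written keyword list.
    scores = {
        category: cosine(expanded, Counter(words))
        for category, words in category_keywords.items()
    }
    return max(scores, key=scores.get)


categories = {
    "weather": ["weather", "temperature", "forecast", "rain"],
    "music": ["song", "singer", "album", "play"],
}
label = classify(
    "how hot today",
    ["weather forecast temperature today sunny"],
    categories,
)
```

In this toy case the question alone shares no word with either keyword list; only the expansion text makes the weather category reachable, which is the motivation for extending short questions before classifying them.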
Aiming at the problem that the traditional wavelet, curvelet and contourlet transforms cannot provide an optimal sparse representation of an image and therefore cannot achieve a good enhancement effect, an image enhancement algorithm based on the Shearlet transform was proposed. The image was decomposed into low frequency and high frequency components by the Shearlet transform. Firstly, Multi-Scale Retinex (MSR) was used to enhance the low frequency components of the Shearlet decomposition to remove the effect of illumination on the image. Secondly, threshold denoising was applied to the high frequency coefficients at each scale to suppress noise. Finally, a fuzzy contrast enhancement method was applied to the reconstructed image to improve its overall contrast. The experimental results show that the proposed algorithm significantly improves the visual quality of the image, preserves more texture detail and is more robust to noise. The image definition, the entropy and the Peak Signal-to-Noise Ratio (PSNR) are improved to a certain extent compared with the Histogram Equalization (HE), MSR and Fuzzy contrast enhancement in Non-Subsampled Contourlet Domain (NSCT_fuzzy) algorithms. The running time is reduced to about one half of that of MSR and one tenth of that of NSCT_fuzzy.
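The threshold denoising step on high frequency coefficients can be illustrated with a small sketch. The abstract does not say whether hard or soft thresholding was used or how the threshold was chosen, so the soft thresholding rule and the fixed threshold `t` below are assumptions, and the coefficient list is a made-up stand-in for one Shearlet subband.

```python
def soft_threshold(coeffs, t):
    # Shrink each coefficient toward zero by t; coefficients whose magnitude
    # is at most t are treated as noise and set to zero.
    out = []
    for c in coeffs:
        magnitude = abs(c) - t
        if magnitude <= 0:
            out.append(0.0)
        else:
            out.append(magnitude if c > 0 else -magnitude)
    return out


# Hypothetical high frequency coefficients from one scale and direction.
subband = [3.0, -0.5, -2.0, 0.2]
denoised = soft_threshold(subband, 1.0)
```

Small coefficients, which mostly carry noise, are removed, while large coefficients, which mostly carry edges and texture, are kept with reduced magnitude.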
Concerning the low accuracy of tagging ambiguous Chinese words, a tagging method combining rules with a statistical model was proposed. Firstly, three traditional statistical models, namely Hidden Markov Model (HMM), Maximum Entropy (ME) and Conditional Random Field (CRF), were applied to the tagging of ambiguous words. Then, an improved mutual information algorithm was applied to learn Part Of Speech (POS) tagging rules, which were obtained by calculating the correlation between the target words and nearby word units. Finally, the rules were combined with the statistical models to tag ambiguous Chinese words. The experimental results show that adding the rule algorithm improves the average accuracy of POS tagging by 5%.
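Learning tagging rules from the correlation between a target word and its neighbors can be sketched as follows. The abstract does not describe its improved mutual information algorithm, so this sketch uses plain pointwise mutual information (PMI) instead, looks only at the preceding word, and lets a learned rule override the statistical model's tag. The function names, the `min_pmi` cutoff and the toy samples are all hypothetical.

```python
import math
from collections import Counter


def learn_rules(samples, min_pmi=0.0):
    # samples: (previous_word, tag) pairs observed for one ambiguous target word.
    pair_counts = Counter(samples)
    context_counts = Counter(context for context, _ in samples)
    tag_counts = Counter(tag for _, tag in samples)
    n = len(samples)
    best = {}
    for (context, tag), count in pair_counts.items():
        p_joint = count / n
        p_context = context_counts[context] / n
        p_tag = tag_counts[tag] / n
        pmi = math.log2(p_joint / (p_context * p_tag))
        # Keep, per context word, the tag with the highest PMI above the cutoff.
        if pmi > min_pmi and pmi > best.get(context, (None, float("-inf")))[1]:
            best[context] = (tag, pmi)
    return {context: tag for context, (tag, _) in best.items()}


def tag_with_rules(previous_word, model_tag, rules):
    # Assumption: a matching rule overrides the statistical model's prediction.
    return rules.get(previous_word, model_tag)


# Toy observations for a single ambiguous word.
samples = [
    ("a", "NOUN"), ("a", "NOUN"), ("the", "NOUN"),
    ("to", "VERB"), ("to", "VERB"), ("to", "NOUN"),
]
rules = learn_rules(samples)
```

With these samples, the context "to" is associated with VERB more strongly than chance, so a model prediction of NOUN after "to" would be corrected, while contexts without a rule keep the model's tag.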